An outfit visualization method generates an image of a person wearing real garments from images of those garments. Current methods can produce images that look realistic and preserve garment identity, captured in details such as collar, cuffs, texture, hem, and sleeve length. However, no current method can both control how the garment is worn (tucked or untucked, open or closed, high or low on the waist, etc.) and generate realistic images that accurately preserve the properties of the original garment. We describe an outfit visualization method that controls drape while preserving garment identity. Our system allows instance-independent editing of garment drape, which means a user can construct an edit (e.g. tucking a shirt in a specific way) that can be applied to all shirts in a garment collection. Garment detail is preserved by using a warping procedure to place the garment on the body; a generator then supplies fine shading detail. To achieve instance-independent control, we use control points with garment category-level semantics to guide the warp. The method produces state-of-the-art quality images while allowing creative ways to style garments: tops can be tucked or untucked, jackets can be worn open or closed, skirts can be worn higher or lower on the waist, and so on. The method also allows interactive control to correct errors in individual renderings. Because the edits are instance independent, they can be applied to large pools of garments automatically and can be conditioned on garment metadata (e.g. all cropped jackets are worn closed or all bomber jackets are worn closed).
translated by 谷歌翻译
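Model stealing attacks present a dilemma for public machine learning APIs. To protect their financial investments, companies may be forced to withhold important information about their models that could facilitate theft, including uncertainty estimates and prediction explanations. This compromise harms not only users but also external transparency. Model stealing defenses seek to resolve this dilemma by making models harder to steal while preserving utility for benign users. However, existing defenses perform poorly in practice, requiring either enormous computational overhead or severe utility trade-offs. To meet these challenges, we present a new approach to model stealing defenses called gradient redirection. At the core of our approach is a provably optimal, efficient algorithm for steering an adversary's training updates in a targeted manner. Combined with improvements to surrogate networks and a novel coordinated defense strategy, our gradient redirection defense, called GRAD${}^2$, achieves small utility trade-offs and low computational overhead, outperforming the best prior defenses. Moreover, we demonstrate how gradient redirection can reprogram the adversary with arbitrary behavior, which we hope will foster work on new avenues of defense.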
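Despite recent advances in few-shot image stylization, these methods fail to capture stylistic details that are obvious to humans. Details such as the shape of the eyes or the boldness of the lines are especially difficult for a model to learn, particularly in a limited-data setting. In this work, we aim to perform one-shot image stylization that gets the details right. Given a reference style image, we approximate paired real data using GAN inversion and finetune a pretrained StyleGAN on that approximate paired data. We then encourage the StyleGAN to generalize so that the learned style can be applied to all other images.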
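DIVeR builds on the key ideas of NeRF and its variants (density models and volume rendering) to learn 3D object models that can be rendered realistically from small numbers of images. In contrast to all previous NeRF methods, DIVeR uses deterministic rather than stochastic estimates of the volume rendering integral. DIVeR's representation is a voxel-based field of features. To compute the volume rendering integral, a ray is broken into intervals, one per voxel; components of the volume rendering integral are estimated from the features of each interval using an MLP, and the components are aggregated. As a result, DIVeR can render thin translucent structures that other integrators miss. Furthermore, DIVeR's representation has semantics that are relatively exposed compared to other such methods: moving feature vectors around in the voxel space results in natural edits. Extensive qualitative and quantitative comparisons to current state-of-the-art methods show that DIVeR produces models that (1) render at or above state-of-the-art quality, (2) are very small without being baked, (3) render very fast without being baked, and (4) can be edited in natural ways.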
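The aggregation step described above can be illustrated with standard front-to-back alpha compositing over per-voxel intervals. This is a minimal sketch, not the authors' implementation: the function name `composite_intervals` is hypothetical, and the per-interval opacities and colors are assumed to have already been produced (in the abstract's terms, by an MLP applied to each interval's features).

```python
from typing import Sequence, Tuple

Color = Tuple[float, float, float]


def composite_intervals(
    alphas: Sequence[float], colors: Sequence[Color]
) -> Tuple[Color, float]:
    """Aggregate per-interval opacity and color along one ray, front to back.

    alphas[i] is the (assumed, precomputed) opacity of the i-th voxel
    interval and colors[i] its RGB color. Returns the composited color
    and the transmittance remaining after the last interval.
    """
    if len(alphas) != len(colors):
        raise ValueError("alphas and colors must have the same length")

    transmittance = 1.0  # fraction of light not yet absorbed
    out = [0.0, 0.0, 0.0]
    for alpha, color in zip(alphas, colors):
        weight = transmittance * alpha  # contribution of this interval
        for k in range(3):
            out[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return (out[0], out[1], out[2]), transmittance


if __name__ == "__main__":
    # A half-transparent red interval in front of an opaque green one.
    rgb, remaining = composite_intervals([0.5, 1.0], [(1, 0, 0), (0, 1, 0)])
    print(rgb, remaining)
```

Because every interval that the ray crosses contributes exactly one term, nothing along the ray is skipped by sampling, which is one way to see why a deterministic scheme of this kind can retain thin structures.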
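Recently, StyleGAN has enabled a variety of image manipulation and editing tasks thanks to its high-quality generation and disentangled latent space. However, additional architectures or task-specific training paradigms are usually required for different tasks. In this work, we take a deeper look at the spatial properties of StyleGAN. We show that with a pretrained StyleGAN and a few operations, without any additional architecture, we can perform comparably to state-of-the-art methods on various tasks, including image blending, panorama generation, generation from a single image, controllable and local multimodal image-to-image translation, and attribute transfer. The proposed method is simple, effective, efficient, and applicable to any existing pretrained StyleGAN model.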
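We show how to insert an object from one image into another and obtain realistic results in the hard case, where the shading of the inserted object clashes with the shading of the scene. Rendering the object with an illumination model of the scene does not work, because doing so requires a geometric and material model of the object, which is hard to recover from a single image. In this paper, we introduce a method that corrects shading inconsistencies of the inserted object without requiring a geometric and physical model or an environment map. Our method uses a deep image prior (DIP), trained to produce reshaded renderings of inserted objects via consistent image decomposition inferential losses. The final image from the DIP aims to have (a) an albedo similar to the cut-and-paste albedo, (b) a shading field similar to that of the target scene, and (c) a shading that is consistent with the cut-and-paste surface normals. The result is a simple procedure that produces convincing shading of the inserted object. We show the efficacy of our method both qualitatively and quantitatively for several objects with complex surface properties, and also on a dataset of spherical lampshades for quantitative evaluation. Our method significantly outperforms an image harmonization (IH) baseline for all of these objects. It also outperforms the cut-and-paste and IH baselines in a user study with over 100 users.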
In this paper, we propose a novel technique, namely INVALIDATOR, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. INVALIDATOR reasons about program semantics via program invariants, and it captures program syntax via language semantics learned from a large code corpus using a pre-trained language model. Given a buggy program and the developer-patched program, INVALIDATOR infers likely invariants on both programs. Then, INVALIDATOR determines that an APR-generated patch overfits if it (1) violates correct specifications or (2) maintains erroneous behaviors of the original buggy program. If the invariant-based check cannot determine whether a patch overfits, INVALIDATOR uses a model trained on labeled patches to assess patch correctness based on program syntax. The benefit of INVALIDATOR is three-fold. First, INVALIDATOR leverages both semantic and syntactic reasoning to enhance its discriminative capability. Second, INVALIDATOR does not require new test cases to be generated; instead, it relies only on the current test suite and uses invariant inference to generalize the behaviors of a program. Third, INVALIDATOR is fully automated. We conducted our experiments on a dataset of 885 patches generated on real-world programs in Defects4J. Experimental results show that INVALIDATOR correctly classified 79% of overfitting patches, detecting 23% more overfitting patches than the best baseline. INVALIDATOR also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.
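The two invariant-based overfitting conditions can be sketched with set operations. This is a simplified illustration under stated assumptions, not INVALIDATOR's actual algorithm: invariants are modeled as plain strings, "correct specifications" are assumed to be the invariants of the developer-patched program, "erroneous behaviors" are assumed to be invariants that hold on the buggy program but not on the developer-patched one, and the name `check_patch` is hypothetical.

```python
from typing import Set, Tuple


def check_patch(
    buggy_inv: Set[str], dev_inv: Set[str], apr_inv: Set[str]
) -> Tuple[str, Set[str], Set[str]]:
    """Apply the two invariant-based rules to an APR-generated patch.

    Assumption: each argument is the set of likely invariants inferred
    on the buggy, developer-patched, and APR-patched program.
    """
    correct_specs = dev_inv                # assumed proxy for correct behavior
    error_behaviors = buggy_inv - dev_inv  # invariants the developer fix removed

    violated = correct_specs - apr_inv     # rule (1): correct specs not kept
    retained = error_behaviors & apr_inv   # rule (2): buggy behavior kept

    if violated or retained:
        return "overfitting", violated, retained
    # Invariants are inconclusive; the abstract defers such cases to the
    # syntax-based classifier, which this sketch does not model.
    return "undetermined", violated, retained


if __name__ == "__main__":
    buggy = {"x >= 0", "ret == null"}
    dev = {"x >= 0", "ret != null"}
    print(check_patch(buggy, dev, {"x >= 0", "ret == null"}))
    print(check_patch(buggy, dev, {"x >= 0", "ret != null"}))
```

Returning the violated and retained sets alongside the verdict keeps the decision explainable: a reviewer can see which invariant triggered each rule.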
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
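The link between non-identifiability and collapse can be seen from Bayes' rule. The short derivation below is a sketch that assumes "non-identifiable" means the likelihood does not depend on the latent variable; the symbols $p_\theta$, $x$, and $z$ are notation chosen here, not taken from the abstract.

```latex
% Assumption: the likelihood is constant in z, i.e. p_\theta(x \mid z) = p_\theta(x) for all z.
\begin{align*}
p_\theta(z \mid x)
  &= \frac{p_\theta(x \mid z)\, p(z)}{p_\theta(x)}
   = \frac{p_\theta(x)\, p(z)}{p_\theta(x)}
   = p(z). \\
% Conversely, if p_\theta(z \mid x) = p(z) wherever p(z) > 0:
p_\theta(x \mid z)
  &= \frac{p_\theta(z \mid x)\, p_\theta(x)}{p(z)}
   = p_\theta(x).
\end{align*}
```

Both steps use only the exact posterior, which is consistent with the claim that collapse can arise even under exact inference.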
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density from low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model achieved an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin doubled from 2015 to 2021, with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.